Generalization error bounds for the logical analysis of data
Author
Abstract
This paper analyzes the predictive performance of standard techniques for the ‘logical analysis of data’ (LAD), within a probabilistic framework. It does so by bounding the generalization error of related polynomial threshold functions in terms of their complexity and how well they fit the training data. We also quantify the predictive accuracy in terms of the extent to which there is a large separation (a ‘large margin’) between (most of) the positive and negative observations.
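To make the setting concrete, here is a minimal sketch of an LAD-style classifier on binary data. The patterns, the unweighted vote, and the function names are illustrative assumptions, not the paper's exact construction; the point is only that such a pattern vote is a (polynomial) threshold function of the input bits.

```python
# Hypothetical sketch of an LAD-style classifier (not the paper's exact method).
# A "pattern" is a conjunction of literals, e.g. {0: 1, 2: 0} means
# x[0] == 1 AND x[2] == 0. The classifier thresholds the difference between
# the number of positive and negative patterns that the point satisfies.

def matches(pattern, x):
    """True if binary vector x satisfies every literal in the pattern."""
    return all(x[i] == v for i, v in pattern.items())

def lad_classify(x, pos_patterns, neg_patterns):
    """Classify x as +1/-1 by an unweighted pattern vote."""
    score = sum(matches(p, x) for p in pos_patterns) \
          - sum(matches(p, x) for p in neg_patterns)
    return 1 if score >= 0 else -1

# Toy example: one positive pattern (x0 AND x1), one negative pattern (NOT x0).
pos = [{0: 1, 1: 1}]
neg = [{0: 0}]
print(lad_classify([1, 1, 0], pos, neg))  # satisfies the positive pattern -> 1
print(lad_classify([0, 1, 1], pos, neg))  # satisfies the negative pattern -> -1
```

The margin notion in the abstract corresponds, in this picture, to how decisively the vote separates typical positive points from typical negative ones.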
Similar articles
Robust cutpoints in the logical analysis of numerical data
Techniques for the logical analysis of binary data have successfully been applied to non-binary data which has been ‘binarized’ by means of cutpoints; see [8, 9]. In this paper, we analyse the predictive performance of such techniques and, in particular, we derive generalization error bounds that depend on how ‘robust’ the cutpoints are.
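The binarization step mentioned above can be illustrated with a small sketch: each numeric attribute is replaced by 0/1 indicators of the form x[j] >= t for a set of chosen cutpoints t. The cutpoint values and the helper name below are hypothetical, chosen only to show the transformation; how the cutpoints are selected, and how robust they are to perturbation of the data, is what the cited analysis is about.

```python
# Hypothetical illustration of cutpoint binarization for numeric data.
# Each numeric attribute j is mapped to indicator bits [x[j] >= t] for
# every cutpoint t assigned to that attribute.

def binarize(x, cutpoints):
    """cutpoints[j] is the list of thresholds for attribute j.
    Returns the concatenated 0/1 indicators x[j] >= t."""
    return [int(x[j] >= t) for j, ts in enumerate(cutpoints) for t in ts]

# Toy example: two numeric attributes with cutpoints {2.5} and {0.0, 1.0}.
cuts = [[2.5], [0.0, 1.0]]
print(binarize([3.0, 0.5], cuts))  # [1, 1, 0]
```

A 'robust' cutpoint, in the sense of the abstract, is one lying far from the observed attribute values, so that small perturbations of the data do not flip any indicator bit.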
The performance of a new hybrid classifier based on boxes and nearest neighbors
In this paper we present a new type of binary classifier defined on the unit cube. This classifier combines some of the aspects of the standard methods that have been used in the logical analysis of data (LAD) and geometric classifiers, with a nearest-neighbor paradigm. We assess the predictive performance of the new classifier in learning from a sample, obtaining generalization error bounds th...
متن کاملA Method to Schedule Both Transportation and Production at the Same Time in a Special FMS
This report analyses the predictive performance of standard techniques for the ‘logical analysis of data’ (LAD), within a probabilistic framework. Improving and extending earlier results, we bound the generalization error of classifiers produced by standard LAD methods in terms of their complexity and how well they fit the training data. We also obtain bounds on the predictive accuracy which de...
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
One of the defining properties of deep learning is that models are chosen to have many more parameters than available training data. In light of this capacity for overfitting, it is remarkable that simple algorithms like SGD reliably return solutions with low test error. One roadblock to explaining these phenomena in terms of implicit regularization, structural properties of the solution, and/o...
Multi-class SVMs: From Tighter Data-Dependent Generalization Bounds to Novel Algorithms
This paper studies the generalization performance of multi-class classification algorithms, for which we obtain—for the first time—a data-dependent generalization error bound with a logarithmic dependence on the class size, substantially improving the state-of-the-art linear dependence in the existing data-dependent generalization analysis. The theoretical analysis motivates us to introduce a n...
Journal: Discrete Applied Mathematics
Volume 160, Issue -
Pages -
Publication date: 2012